Quick Start

This guide will help you get started with the LW-BenchHub for kitchen manipulation tasks. The framework supports multiple robots, teleoperation data collection, trajectory replay, and reinforcement learning training.

Prerequisites

Before starting, ensure you have properly installed the lw_benchhub environment following the installation instructions. You should have:

  • NVIDIA Isaac Lab properly configured
  • All required dependencies installed via install.sh
  • GPU drivers and CUDA properly set up

Teleoperation Data Collection

LW-BenchHub provides a comprehensive teleoperation system for collecting demonstration data in simulation kitchen environments. You can control various robots using different input devices to perform manipulation tasks.

Quick Start (choose a configuration)

Run the teleoperation interface by specifying a config preset name (e.g., pandaomron or g1-hand):

python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config g1-hand
# or
python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config pandaomron

Tip: If you prefer using teleop.sh, edit the script to set task_config to a valid preset.

Configuration Overview (teleop_base.yml)

The teleoperation system is highly configurable. Here's the default configuration structure:

# =========================
# Basic Settings
# =========================
disable_fabric: false
num_envs: 1
device: cpu
sensitivity: 1.0
step_hz: 50

# =========================
# Robot Settings
# =========================
robot: PandaOmron-Rel
robot_scale: 1.0

# =========================
# Task & Scene Settings
# =========================
task: RobocasaBaseTask
scene_backend: robocasa
task_backend: robocasa
debug_assets: null
layout: robocasakitchen
sources:
- objaverse
- lightwheel
- aigen_objs
object_projects: []
usd_simplify: false
seed: 0

# =========================
# Data Collection Settings
# =========================
record: false
dataset_file: ./datasets/dataset.hdf5
num_demos: 1
enable_debug_log: false
continue_teleop_after_success: false

# =========================
# Teleoperation Device Settings
# =========================
enable_pinocchio: false
teleop_device: keyboard
relative_control: false
first_person_view: false
enable_rendering_optimization: true
enable_multiple_viewports: false
action_delay_async: false
action_delay_time_s: 0
action_delay_type: fixed
action_buffer_size: 100

# =========================
# Video Saving Settings
# =========================
enable_cameras: false
save_video: false
video_save_dir: ./videos
video_fps: 30

# =========================
# Object Placement Retry Settings
# =========================
max_scene_retry: 4
max_object_placement_retry: 3

resample_objects_placement_on_reset: true
resample_robot_placement_on_reset: true

Customizing Your Configuration

You can create your own configuration file in the configs directory (config file names must be globally unique across the entire project). Simply add the following line at the top of your new config file:

_base_: teleop_base

This inherits all settings from teleop_base.yml, and you can freely override any of them. For example, you can override parameters such as robot, layout, record, num_demos, or teleop_device to customize the scene, robot, control device, data recording, and more.
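
As an illustration (the file name and values below are hypothetical, chosen from the options documented on this page), a custom preset that records SO101 demonstrations could look like:

# configs/my_teleop.yml (hypothetical file name)
_base_: teleop_base

# Override only what you need; values are illustrative
teleop_device: so101leader
record: true
num_demos: 10
dataset_file: ./datasets/so101_demos.hdf5

Assuming presets are addressed by file name, you would then launch it with python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config my_teleop.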

If you want to inherit multiple configuration files, simply modify the inheritance syntax as follows:

_base_: [teleop_base, config_A, config_B, ...]

Override priority follows the list order: the later a config appears in the list, the higher its priority.
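
For example (with hypothetical configs), if config_A sets num_demos: 5 and config_B sets num_demos: 10, then

_base_: [teleop_base, config_A, config_B]

results in num_demos: 10, since config_B appears later in the list and therefore wins.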

Available Input Devices

  • keyboard: Default input method with intuitive key mappings
  • dualhandtracking_abs: Two-hand tracking with retargeting (recommended for G1-Hand)
  • vr-controller: Supports standard VR controllers, including PICO and QUEST, for robot teleoperation with full button and directional control
  • vr-hand: Uses PICO/QUEST hand tracking to control the robot
  • so101leader: SO101 master-slave arm control
  • bi_so101leader: Dual-arm SO101 master-slave control
Key notes
  • For dualhandtracking_abs, control is performed using Vision Pro. For Vision Pro configuration and usage, please refer to the IsaacLab documentation: https://isaac-sim.github.io/IsaacLab/main/source/how-to/cloudxr_teleoperation.html#cloudxr-teleoperation
  • For vr-controller and vr-hand, we use Open-TeleVision to connect PICO/QUEST devices. For instructions on how to use Open-TeleVision, please see https://github.com/OpenTeleVision/TeleVision
  • Controller quick guide: Use the trigger to open/close the gripper, the joystick to move the base, and the grip button to toggle base movement mode (rotate in place or translate). By default, the right controller operates the robot. For certain robots (e.g., dual-arm or X7s waist), use the left controller's trigger or joystick for control.

Trajectory Replay

After collecting teleoperation data, you can replay and analyze the trajectories using two different replay modes. This is essential for validating data quality and understanding robot behavior.

State-Based Replay

State replay reproduces the exact state of all objects in the scene, including robot poses, object positions, and articulation states. This provides perfect reproduction of the recorded session.

python ./lw_benchhub/scripts/teleop/replay_demos.py --dataset_file "/path/to/your/dataset.hdf5" --enable_cameras

Use cases:

  • Analyzing manipulation strategies
  • Debugging data collection issues
  • Creating reference videos
  • Understanding object interactions

Action-Based Replay

Action replay tests the robot's ability to reproduce behaviors by replaying only the actions, allowing the simulation to determine the resulting states. This mode tests the robustness of the recorded behaviors.

Joint Target Replay

Replays the computed joint targets, allowing the robot's control system to track these targets:

python ./lw_benchhub/scripts/teleop/replay_action_demo.py \
--dataset_file /path/to/your/dataset.hdf5 \
--replay_mode joint_target \
--enable_cameras

Features:

  • Tests joint-level control directly, bypassing inverse kinematics
  • Validates simulation environment consistency and stability
  • Evaluates precise reproduction of joint trajectories

Action Replay

Replays the high-level actions through the full control pipeline (e.g., inverse kinematics + PD control):

python ./lw_benchhub/scripts/teleop/replay_action_demo.py \
--dataset_file /path/to/your/dataset.hdf5 \
--replay_mode action \
--enable_cameras

Features:

  • End-to-end action validation
  • Tests complete control pipeline
  • Evaluates task completion success

Replay Analysis

Both replay modes provide valuable insights:

  • Trajectory smoothness: Assess motion quality
  • Task success rate: Validate demonstration effectiveness
  • Control stability: Identify potential control issues
  • Data quality: Ensure consistent robot behavior

Reinforcement Learning

LW-BenchHub provides a complete RL pipeline integrated with SKRL. An RL run is defined by a YAML preset under configs/rl/skrl/ (e.g., g1_liftobj.yaml) that selects a robot, task, and scene. The environment itself is composed from a robot RL variant and a task RL config registered in lw_benchhub_rl/__init__.py.

Training a Policy

Launch RL training with the default preset:

bash train.sh   # default preset uses LiftObj (state variant)

The script sets GPU variables internally. To use a different preset/variant, edit train.sh or run the Python entry directly:

python ./lw_benchhub/scripts/rl/train.py \
--task_config lerobot_liftobj_state \
--headless

Advanced Training Options

Use your own preset under configs/rl/skrl/ (e.g., my_robot_my_task.yaml):

_base_: rl_base
task: MyTask
robot: MyRLRobot
layout: AnyLayout
rl: MyRLConfig
num_envs: 128
usd_simplify: true # Enable USD simplification
enable_cameras: false

Then train:

python ./lw_benchhub/scripts/rl/train.py --task_config my_robot_my_task --headless

RL configuration parameters

Below is a brief guide to the main parameters used by the RL presets and the agent configuration.

Environment preset (rl_base.yml)

remote_protocal: ipc         # restful or ipc
for_rl: true
concatenate_terms: false
disable_fabric: false # Disable fabric and use USD I/O operations
device: cuda:0

# Environment Configuration
num_envs: 10 # Number of environments to simulate
layout: robocasakitchen # Scene layout
sources: # Object source name
- objaverse
- lightwheel
- aigen_objs
object_projects: [] # Object project name
scene_backend: robocasa # Scene backend name
task_backend: robocasa # Task backend name
debug_assets: null # Debug assets name
robot: null # Robot cfg name
task: null # Task cfg name
rl: null # RL cfg name
variant: null # RL variant name, define when register the RL config
usd_simplify: false # Enable USD simplification
enable_rendering_optimization: true # Enable rendering optimization

# Asset and Export Paths
export_base_path: null # Export base path (replace with actual ASSET_PATH if defined in code)
robot_scale: 1.0 # Robot scale
first_person_view: false # Enable first-person view

# Reinforcement Learning Settings
video: false # Record videos during training
video_length: 200 # Length of recorded video (in steps)
video_interval: 2000 # Interval between video recordings (in steps)
seed: null # Seed used for the environment
distributed: false # Run training with multiple GPUs or nodes
max_iterations: null # Max RL policy training iterations

# ML Framework Settings
ml_framework: "torch" # The ML framework used for training
# Options: ["torch", "jax", "jax-numpy"]

algorithm: "PPO" # The RL algorithm used for training
# Options: ["AMP", "PPO", "IPPO", "MAPPO"]

# Checkpoint
checkpoint: null

# Play Settings
use_pretrained_checkpoint: false
real_time: false
enable_cameras: false
check_success: true
Key notes
  • Effective fabric usage is computed as use_fabric = not disable_fabric in the code.
  • num_envs trades off throughput vs. GPU memory.
  • algorithm selects which SKRL config key is used internally.

Task preset (e.g., g1_liftobj_state.yaml)

_base_: rl_base
task: LiftObj # Task name registered in lw_benchhub_rl
robot: G1-RL # Robot RL variant (action heads / EE frames)
layout: robocasakitchen-1-8 # Kitchen scene layout
rl: G1LiftObjStateRL # RL Config
num_envs: 1024 # Override of rl_base default
usd_simplify: true # Override for faster loading
enable_cameras: false # State variant does not require cameras (set true for Visual)
Key notes
  • The final environment is composed from the RL config registered in lw_benchhub_rl/__init__.py.
  • You can override any key from rl_base.yml in the task preset.

Agent configuration (SKRL) — example fields

models:
  separate: false # Single shared backbone for policy/value (false) or separate nets (true)
  policy:
    class: GaussianMixin # Stochastic Gaussian policy (continuous actions)
    clip_log_std: true # Clamp log_std to [min_log_std, max_log_std]
    min_log_std: -20.0
    max_log_std: 2.0
    initial_log_std: 0.0
    network:
      - name: net
        input: STATES # Input tensor spec (auto: STATES/ACTIONS)
        layers: [512, 256, 128, 64]
        activations: elu
    output: ACTIONS
  value:
    class: DeterministicMixin # Scalar value head
    network:
      - name: net
        input: STATES
        layers: [512, 256, 128, 64]
        activations: elu
    output: ONE

memory:
  class: RandomMemory # Rollout buffer backend
  memory_size: -1 # Auto = agent.rollouts

agent:
  class: PPO
  rollouts: 24 # Steps per environment before update (episode fragments)
  learning_epochs: 8 # Optimization epochs per update
  mini_batches: 4 # Gradient batches per epoch
  discount_factor: 0.99 # Gamma
  lambda: 0.95 # GAE lambda
  learning_rate: 1.0e-04
  learning_rate_scheduler: KLAdaptiveLR
  learning_rate_scheduler_kwargs:
    kl_threshold: 0.01 # Target KL for LR adaptation
  state_preprocessor: RunningStandardScaler
  value_preprocessor: RunningStandardScaler
  random_timesteps: 0 # Warmup steps with random actions
  learning_starts: 0 # Start updates after N steps
  grad_norm_clip: 1.0
  ratio_clip: 0.2 # PPO epsilon clip
  value_clip: 0.2
  clip_predicted_values: true
  entropy_loss_scale: 0.001 # Entropy bonus
  value_loss_scale: 2.0
  kl_threshold: 0.0 # Additional KL early stop (if used)
  rewards_shaper_scale: 0.01 # Global reward scaling factor
  time_limit_bootstrap: false # Bootstrap value at time limit
  experiment:
    directory: "lift_obj" # Log root under policy/skrl/logs/
    experiment_name: "" # Optional suffix
    write_interval: auto
    checkpoint_interval: auto

trainer:
  class: SequentialTrainer
  timesteps: 72000 # Total environment steps (num_envs * steps)
  environment_info: log
Practical tips
  • Effective batch size per update roughly scales with rollouts * num_envs (and sequence length if recurrent); see the quick calculation below.
  • Increase entropy_loss_scale to encourage exploration; lower it for exploitation.
  • Use learning_rate_scheduler with kl_threshold to stabilize training automatically.
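
As a quick sanity check with the example values above (num_envs: 1024 from the g1_liftobj_state preset, rollouts: 24 and mini_batches: 4 from the agent config), a minimal Python sketch of the per-update sample counts:

# Rough PPO batch-size arithmetic for the example preset above
rollouts, num_envs, mini_batches = 24, 1024, 4
samples_per_update = rollouts * num_envs                     # 24 * 1024 = 24576 transitions per update
samples_per_minibatch = samples_per_update // mini_batches   # 24576 / 4 = 6144 per gradient step
print(samples_per_update, samples_per_minibatch)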

Additional RL Configuration (rl field) - Example

@rl_on(task=task_A)
@rl_on(task=task_B)
@rl_on(embodiment=robot_A)
@rl_on(embodiment=robot_B)
class TaskRobotRL(LwRL):
    # Example RL configuration for reward and observation tailored to a series of tasks
    # (task_A, task_B, ...) and a series of robots (robot_A, robot_B, ...).
    def __init__(self):
        super().__init__()
        self.policy_observation_cfg = TaskRobotPolicyObsCfg()
        self.rewards_cfg = RewardsCfg()
        self.events_cfg = EventCfg()

    def setup_env_config(self, orchestrator):
        # Set specific reward/observation configuration, link and eef names for your RL task
        pass

    def modify_env_cfg(self, env_cfg):
        # Set special environment configuration variables, such as the end-effector (ee) frame cfg
        pass

GPU Configuration

The training script automatically handles GPU allocation:

  • Single GPU: Uses one GPU for both environment and policy
  • Multi-GPU: Distributes environment and policy across different GPUs
  • CPU fallback: Supports CPU-only training for development (see the preset sketch below)
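
Since any key from rl_base.yml can be overridden in a preset (see the task preset notes above), a sketch of the relevant settings, with illustrative values:

# In your preset under configs/rl/skrl/ (illustrative values)
device: cuda:0        # or cpu for CPU-only development
distributed: false    # set to true to distribute training across multiple GPUs or nodes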

Policy Evaluation

Test your trained policies using the evaluation script:

bash eval.sh   # uses task_config=lerobot_liftobj_state_play by default

Or run custom evaluation:

python ./lw_benchhub/scripts/rl/play.py \
--task_config lerobot_liftobj_state_play
# Optional flags: --enable_cameras, --headless

Training Monitoring

  • Tensorboard: Monitor training progress and metrics (see the example command below)
  • Video logging: Automatic recording of policy performance
  • Checkpoint saving: Regular model checkpointing for recovery
  • Evaluation episodes: Periodic policy evaluation during training
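
For example, assuming the default experiment directory from the agent configuration above (directory: "lift_obj" under policy/skrl/logs/), TensorBoard can be pointed at the log root:

tensorboard --logdir ./policy/skrl/logs/lift_obj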

Next Steps

Once you're familiar with the basic workflow, explore these advanced features:

  1. Custom Tasks: Create new manipulation tasks by extending the base task classes
  2. Robot Integration: Add support for new robot configurations
  3. Multi-Modal Learning: Combine teleoperation data with RL training
  4. Scene Customization: Design custom kitchen layouts and scenarios
  5. Algorithm Integration: Implement new RL algorithms using the SKRL framework

For detailed information about each component, refer to the comprehensive documentation and API references.

Troubleshooting

Common Issues

  • GPU Memory: Reduce num_envs if encountering memory issues
  • Control Instability: Adjust sensitivity and step_hz parameters
  • Rendering Issues: Check Isaac Lab installation and graphics drivers
  • Data Collection: Ensure record: true is set and dataset_file points to a valid path (see the sketch below)
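
For the data collection case, a minimal sketch of the relevant teleop settings (the path is illustrative):

record: true
dataset_file: ./datasets/my_demos.hdf5   # must be a valid, writable path
num_demos: 1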

Performance Optimization

  • Use --headless for faster training without visualization
  • Enable --enable_cameras only when needed for vision-based policies
  • Optimize step_hz based on your hardware capabilities
  • Use appropriate device setting (cuda/cpu) for your setup

This framework provides a solid foundation for kitchen manipulation research, supporting everything from basic teleoperation to advanced reinforcement learning with multiple robot platforms.